About the Performance of HPF: Improving Runtime on the Cray T3E with Hardware Specific Properties
Abstract
High Performance Fortran (HPF) makes it possible to write parallel programs with much less programming effort than standard communication libraries such as MPI or PVM require. The performance of compiled HPF programs is considered low, though. We show that a compiled HPF application gains a substantial runtime improvement if the compiler incorporates properties of the hardware architecture into the final program. Our prototype HPF compiler “KarHPFn” inserts communication primitives of the Cray T3E into the target programs. Programs compiled with KarHPFn run up to 30 times faster than their counterparts compiled with Portland Group HPF.
Related resources
HPF-2 Support for Dynamic Sparse Computations
There is a class of sparse matrix computations, such as direct solvers of systems of linear equations, that change the fill-in (nonzero entries) of the coefficient matrix, and involve row and column operations (pivoting). This paper addresses the problem of the parallelization of these sparse computations from the point of view of the parallel language and the compiler. Dynamic data structures ...
CRAY T3E and SGI Origin2000: Merging Architectures from the User’s Point of View
While the T3E is very well established as a highly parallel machine in many compute intensive environments, large Origin2000 sites still have to optimize their usage profile to get effective cycles for parallel codes even for moderate numbers of processors. The paper compares T3E and Origin2000 systems, highlighting some details with respect to parallel programming and runtime behavior of appro...
Evaluating PGHPF on the Cray T3D/T3E EPCC-TR98-02
At present, EPCC has access to the Portland Group’s HPF compiler, PGHPF, on the Cray T3D and T3E and on our workstation cluster. We evaluate certain aspects of the compiler which are specific to user’s programs, as opposed to standard benchmarking routines. This work was done in support of the MHD Consortium (led by Dr. Alan Hood) and was funded by the UK’s High Performance Computing Initiative...
Efficient Address Translation
The address calculation for distributed data access plays a major role in the performance of fine-grained data-parallel applications. This paper reports on the hardware centrifuge of the Cray T3E, which enables shifting the address calculation from software into hardware. This shift minimizes address calculation overhead, reducing the communication cost of dynamic communication patterns. The ce...
A scalable HPF implementation of a finite-volume computational electromagnetics application on a CRAY T3E parallel system
The time-dependent Maxwell equations are one of the most important approaches to describing dynamic or wide-band frequency electromagnetic phenomena. A sequential finite-volume, characteristic-based procedure for solving the time-dependent, three-dimensional Maxwell equations has been successfully implemented in Fortran before. Due to its need for a large memory space and high demand on CPU tim...